40 LEU    TRP-78     XX
42 ASP    LEU-87
43 VAL    GLN-90
53 LEU    ILE-78     XX
56 PHE    VAL-67     XX
67 ASP    PHE-87
80 LEU    ILE-109
92 GLY    PHE-106    XX
95 TCP    GLN-101    XX



TABLE III. Tertiary Constraint Lists for 1tim



Set of 62 
constraints 



Set of 50 
constraints 



Set of 37 
constraints 



'bad' 


2-228 


XX 


XX 




4-37 


XX 






4-206 


XX 


XX 




6-123 


XX 


XX 


6-89 








6-162 








7-248 


XX 


XX 




10-94 


XX 






11-64 


XX 


XX 




11-237 


XX 


XX 




15-46 


XX 


XX 




20-49 








23-237 




24-54 




27-59 








27-241 




32-59 




30-245 


XX 


XX 




36-58 


XX 


XX 




26-248 


XX 


41-91 




37-89 


XX 


44-82 


39-123 


XX 




47-63 








47-87 








51-86 


XX 


XX 




59-245 


XX 


XX 




60-89 








63-90 


XX 






66-79 




67-111 




68-114 


XX 






79-114 


XX 






89-162 




82-120 



Set of 62 
constraints 



91-125 
91-231 

93-125

94-166

95-168
98-126 
98-145 

105-145 
105-148 
109-152 
112-149 
112-161 
116-153 
127-145 

127-165

128-142
128-165 
130-175 
133-181 
142-165 

142-189

143-192
150-197 
155-200 
162-208 
165-189 
165-209 
183-225 
193-205 
215-244 



Set of 50 
constraints 



Set of 37 
constraints 



XX 
XX 
XX 

XX

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 

XX 



87-120 
90-122 



XX 
XX 



XX 
XX 

121-160 



XX 
XX 



XX 



XX 
XX 
XX 
XX 
XX 
XX 
XX 
Results of Monte Carlo Simulated Annealing

The results of stage 2 are compiled in Table IV, below. The numbers of
constraints are given next to the protein PDB codes.14 An estimate of the cRMSD
from the PDB structure and the conformational energy (in dimensionless kBT units)
are given for the last snapshot of each trajectory. The cRMSD is measured between
the Cα's of the real structure and the roughly estimated positions of the Cα's of
the model chain. The latter are obtained according to the following definition:

$\mathbf{r}_{C\alpha,i} = (4\mathbf{r}_i + \mathbf{r}_{i-1} + \mathbf{r}_{i+1})/6$, where the sum in the brackets is over the corresponding
side chain coordinates of the model chain. The exact agreement of the secondary
structure of the predicted fold with the experimental structure was not examined in
detail; however, in all runs it was very close to the target, with a small tendency
for helical fragments to be extended by one or two residues in some cases (e.g., the
short helix of plastocyanin). The cRMSD and the energy (in dimensionless kBT units)
correspond to the last snapshots of the second simulated thermal annealing runs.
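
As an illustration of the cRMSD estimate described above, the following is a minimal sketch, not the implementation used for Table IV. It assumes the weighted-average definition, with weights 4, 1 and 1 on the side chain positions of residues i, i-1 and i+1, and it assumes that the cRMSD is taken after an optimal rigid-body superposition, done here with the standard Kabsch procedure. The function names `estimate_ca` and `crmsd` are hypothetical, and skipping the two chain-end residues is an assumption, since the text does not say how ends are handled.

```python
import numpy as np


def estimate_ca(side_chains):
    """Approximate C-alpha positions from model side-chain coordinates.

    Applies (4*r[i] + r[i-1] + r[i+1]) / 6 to interior residues only.
    The two end residues lack a neighbour on one side and are skipped
    (an assumption; the source does not specify the end treatment).
    """
    r = np.asarray(side_chains, dtype=float)
    return (4.0 * r[1:-1] + r[:-2] + r[2:]) / 6.0


def crmsd(a, b):
    """Coordinate RMSD between two (N, 3) point sets after optimal
    superposition by a proper rotation (Kabsch procedure)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a0 = a - a.mean(axis=0)          # remove translation
    b0 = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(a0.T @ b0)
    # Force det(rot) = +1 so that mirror images are not superposed.
    d = np.sign(np.linalg.det(u @ vt))
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    diff = a0 @ rot - b0
    return float(np.sqrt((diff ** 2).sum() / len(a)))


# Usage: three collinear side-chain centres give one interior estimate.
print(estimate_ca([[0, 0, 0], [1, 0, 0], [2, 0, 0]]))  # [[1. 0. 0.]]
```

In practice the estimated positions for residues 2 to N-1 would be compared with the matching Cα's of the PDB structure, e.g. `crmsd(estimate_ca(model), native_ca[1:-1])`, where `model` and `native_ca` are placeholder arrays.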

Generally, the predicted structures cluster into two well-defined groups, one
of which dominates on the basis of energy and is taken to approximate the native
structure. The remaining, misfolded structures (when observed more than once)
were also similar to each other. They represent the topological mirror structure,
in which the chirality of the connections between secondary structural elements
(helices and β-strands) is reversed, but the chirality of the secondary structure
elements themselves is the same as in the native state, e.g., helices remain right handed. Several
interesting observations emerge from the results presented in Table IV, below. First,
in the majority of the runs, the native fold is recovered. The accuracy depends on
protein size and number of constraints, but only slightly on protein type. Generally,
accuracy increases with decreasing protein size. The best accuracy is observed for
the 56-residue B1 domain of protein G,41 where in most simulations the obtained
structures had a cRMSD from native below 3 Å. Interestingly, for the smaller 6pti
fragment with a larger number of constraints, the accuracy was systematically
somewhat worse. This reflects the effect of protein "regularity." The fold of protein
G has a high content of regular secondary structure, while in the 6pti fragment a
substantial fraction of the chain is classified as loop or coil. The analysis of other
cases shows a tendency towards higher accuracy for more regular folds. The
accuracy for helical and α/β proteins is greater than for all-β proteins. This is clearly
demonstrated by a comparison of 1pcy with 2trx. While both proteins are of
comparable size, for 2trx with 16 constraints, structures with a cRMSD below 3.5 Å
are produced, but for 1pcy with 15 constraints, structures above 5.2 Å result.
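
The topological mirror structures mentioned above can be flagged with a simple heuristic, sketched below. This is an illustrative assumption, not the procedure used in the source: the model Cα trace is superposed (proper rotations only) on the native trace and on a reflected copy of the native trace, and the fold is called mirror-like when the reflected copy fits better. A full reflection also inverts helix handedness, which the topological mirror described in the text does not, so the test is only meaningful at the coarse level of the chain trace. The names `superposed_rmsd` and `mirror_check` are hypothetical.

```python
import numpy as np


def superposed_rmsd(a, b):
    """RMSD of two (N, 3) arrays after optimal proper-rotation fit (Kabsch)."""
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    u, _, vt = np.linalg.svd(a0.T @ b0)
    d = np.sign(np.linalg.det(u @ vt))      # exclude improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return float(np.sqrt(((a0 @ rot - b0) ** 2).sum() / len(a)))


def mirror_check(model_ca, native_ca):
    """Return (rmsd_to_native, rmsd_to_reflected_native, is_mirror_like)."""
    model = np.asarray(model_ca, dtype=float)
    native = np.asarray(native_ca, dtype=float)
    reflected_native = native * np.array([-1.0, 1.0, 1.0])  # flip the x axis
    direct = superposed_rmsd(model, native)
    reflected = superposed_rmsd(model, reflected_native)
    return direct, reflected, reflected < direct
```

A run whose last snapshot gives `is_mirror_like == True` would be a candidate for the misfolded group, to be confirmed by inspecting the connections between secondary structure elements.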






TABLE IV. Coordinate cRMSD and Conformational 
Energy of the Final Structure at the End of the 



Name               Run no.   cRMSD (Å)   Energy
6pti(18) a         1         3.3 b       -321.9
41 res             2         3.8         -313.2
(18-56 fragment)   3         4.1         -302.8
6pti(9/S1)         1         4.1         -336.4
                   2         4.2         -345.4
                   3         3.6         -318.9
                   4         3.8         -385.9
6pti(9/S2)         1         3.8         -331.8
                   2         4.3         -320.2
                   3         4.0         -341.6
                   4         4.4         -353.6
6pti(9/S3)         1         3.4         -303.1
                   2         4.0         -318.7
                   3         4.8         -324.5
                   4         MP          -323.2
                   5         MI          -322.5
6pti(9/S4)         1         MI          -319.1
                   2         3.8         -312.8
                   3         4.0         -320.8
                   4         4.2         -280.4
                   5         4.1         -302.0
6pti(9/S5)         1         3.9         -370.0
                   2         MI          -324.7